Publications
total: 348
Catalog Publications
Year 2011
-
Content-Based Approach to Automatic Recommendation of Music
This paper presents a content-based approach to music recommendation. For this purpose, a database which contains more than 50000 music excerpts acquired from public repositories was built. Datasets contain tracks of distinct performers within several music genres. All music pieces were converted to mp3 format and then parameterized based on MPEG-7, mel-cepstral and time-related dedicated parameters. All feature vectors are stored...
-
Intelligent multimedia solutions supporting special education needs.
The role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....
-
Intelligent video and audio applications for learning enhancement
The role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....
Year 2015
-
Creating a numerical model of noise conditions based on the analysis of traffic volume changes in cities with low and medium structure.
The subject of this research study is to analyze noise conditions of the selected area in the city of Gdańsk using data related to traffic volume changes during a day. This is because daily distribution of noise levels is much more helpful for noise control and reduction than traditional maps with Lden levels indicated. Calculations are made with the use of a numerical model developed at the Gdansk Univ. of Technology and implemented...
-
Development of the sound field 3D intensity probe based on miniature microphones
The engineered measuring probe uses three pairs of miniature microphones coupled. The signals from the microphones after an initial amplification are fed to differential circuits. Due to the required symmetry of the circuit it was necessary to select electronic components very carefully. Moreover, additional digital signal processing techniques were applied to avoid amplitude and phase mismatch. The view of the engineered probe...
-
Matching the sound dynamics characteristic to the hearing preferences of mobile device users
To determine the preferred dynamics characteristic of generated sounds, one needs information on how the user perceives the loudness of sounds at different sound levels. The problem should be considered separately for two groups of users: people with normal hearing and people with hearing loss. In the first case, care must be taken that the determined dynamics characteristic properly processes sounds...
-
Evaluation of a Novel Approach to Virtual Bass Synthesis Strategy
The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) strategy applied to portable computers. The developed algorithms involve intelligent, rule-based settings of bass synthesis parameters with regard to music genre of an audio excerpt and the type of a portable device in use. The Smart VBS algorithm performs the synthesis based on a nonlinear device (NLD) with artificial controlling synthesis...
-
GRAPHICAL REPRESENTATION OF MUSIC SET BASED ON MOOD OF MUSIC. GRAPHICAL PRESENTATION OF A MUSIC COLLECTION BASED ON MUSIC MOOD ANNOTATION
One of the features for music recommendation, which is useful and intuitive for music listeners, is “mood”. The paper presents an approach to graphical representation of mood of music pieces. Subjective evaluation based on listening tests is performed for assigning mood labels of 150 pieces of music and placing them on the 2D mood plane. As a result, a map of songs is created, where music excerpts with similar mood are organized...
-
Knowledge representation of motor activity of patients with Parkinson’s disease
An approach to the knowledge representation extraction from biomedical signals analysis concerning motor activity of Parkinson’s disease patients is proposed in this paper. This is done utilizing accelerometers attached to their body as well as exploiting video image of their hand movements. Experiments are carried out employing artificial neural networks and support vector machine to the recognition of characteristic motor activity...
-
Loudness Scaling Tests in Hearing Problems Detection
The number of people using portable audio players has increased significantly over the recent years. This implies the rise in the number of people having hearing loss problems. Therefore, there is a need to find appropriate procedures that simplify the process of the hearing problem detection. Investigations performed show that audiometric tests may not be sufficient to assess hearing in young people. Contrarily, the obtained results...
-
Measurements and visualization of sound field distribution around organ pipe
Measurements and visualization of acoustic field around an organ pipe are presented. Sound intensity technique was applied for this purpose. Measurements were performed in free field. The organ pipe was activated with a constant air flow, produced by an external compressor, aimed at obtaining long-term steady state responses of generated acoustic signal. Sound energy distribution was measured in a defined fixed grid of points...
-
Measurements and Visualization of Sound Intensity Around the Human Head in Free Field Using Acoustic Vector Sensor
This paper presents measurements and visualization of sound intensity around the human head simulator in a free field. A Cartesian robot, applied for precise positioning of the acoustic vector sensor, was used to measure sound intensity. Measurements were performed in a free field using a head and torso simulator and the setup consisting of four different loudspeaker configurations. The acoustic vector sensor was positioned around...
-
Measuring and Analyzing Audio Levels in Film, Commercials, and Movie Trailers Using Leq(A) Values and the LUFS Loudness Model. Analysis of sound measurements in film and film commercials using a loudness model
The purpose of this paper is to describe the measurement of loudness levels in movies, movie trailers, and commercials displayed before feature films at movie theaters. In the initial section, the paper discusses the issues related to measurement of loudness levels, provides recommendations regarding permissible loudness levels during movie screenings, and mentions the applied units of measurement. The following section of the...
Year 2014
-
Creating a Reliable Music Discovery and Recommendation System
The aim of this paper is to show problems related to creating a reliable music discovery system. The SYNAT database that contains audio files is used for the purpose of experiments. The files are divided into 22 classes corresponding to music genres with different cardinality. Of utmost importance for a reliable music recommendation system are the assignment of audio files to their appropriate genres and optimum parameterization...
-
Eye-Gaze Tracking-Based Telepresence System for Videoconferencing
An approach to the teleimmersive videoconferencing system enhanced by the pan-tilt-zoom (PTZ) camera, controlled by the eye-gaze tracking system, is presented in this paper. An overview of the existing telepresence systems, especially dedicated to videoconferencing is included. The presented approach is based on the CyberEye eye-gaze tracking system engineered at the Multimedia Systems Department (MSD) of Gdańsk University of Technology...
-
Frequently updated noise threat maps created with use of supercomputing grid
Innovative supercomputing grid services devoted to noise threat evaluation are presented. The services described in this paper concern two issues: the first is related to noise mapping, while the second focuses on assessment of the noise dose and its influence on the human hearing system. The discussed services were developed within the PL-Grid Plus Infrastructure, which brings together Polish academic supercomputer centers....
-
Intelligent Low-Frequency Synthesis in Mobile Devices
The paper presents an algorithm for intelligent adaptation of low-frequency synthesis parameters in portable devices depending on the music genre being played (Smart VBS). The proposed algorithm uses harmonic generation methods based on a nonlinear function generator (NLD) and a phase vocoder (PV). To find the optimal synthesis parameters, subjective tests were carried out to examine the relationship between the parameters...
Year 2022
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
Year 2013
-
Creating Dynamic Maps of Noise Threat Using PL-Grid Infrastructure
The paper presents functionality and operation results of a system for creating dynamic maps of acoustic noise employing the PL-Grid infrastructure extended with a distributed sensor network. The work presented provides a demonstration of the services being prepared within the PLGrid Plus project for measuring, modeling and rendering data related to noise level distribution in city agglomerations. Specific computational environments,...
-
Creating dynamic maps of noise threat using PL-Grid infrastructure; conference proceedings
This paper presents functionality and operation results of the system for creating dynamic maps of noise threat with the use of the PL-Grid infrastructure integrated with a distributed sensor network for measuring, modeling and rendering noise level distribution. The work presented provides a demonstration of the services being prepared within the PLGrid Plus project. Specific computational environments, so called domain grids,...
-
Examining Classifiers Applied to Static Hand Gesture Recognition in Novel Sound Mixing System
The main objective of the chapter is to present the methodology and results of examining various classifiers (Nearest Neighbor-like algorithm with non-nested generalization (NNge), Naive Bayes, C4.5 (J48), Random Tree, Random Forests, Artificial Neural Networks (Multilayer Perceptron), Support Vector Machine (SVM)) used for static gesture recognition. A problem of effective gesture recognition is outlined in the context of the system...
-
Gesture-controlled Sound Mixing System With a Sonified Interface
In this paper the authors present a novel approach to sound mixing. It is materialized in a system that enables sound mixing with hand gestures recognized in a video stream. The system has been developed in such a way that mixing operations can be performed both with and without visual support. To check the hypothesis that the mixing process needs only an auditory display, the influence of audio information visualization on sound...
-
Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification
We present a comprehensive evaluation of the influence of harmonic and rhythmic sections contained in an audio file on automatic music genre classification. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...
-
Language material for English audiovisual speech recognition system development. Language material for use in an English audiovisual speech recognition system
The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Low-Level Music Feature Vectors Embedded as Watermarks
In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
Year 2002
-
Decomposition of duet instrument sounds. In: [CD-ROM] International Symposium of Musical Acoustics. ISMA MEXICO CITY. Mexico City, 9-13 December 2002. Mexico City: Escuela Nacional de Musica UNAM, 2002, 10 pp., 4 figs., 2 tables, 15 refs. Decomposition of musical duets.
The paper presents an algorithm for separating recordings of musical duets. The separation method is based on the FED algorithm, which makes it possible to extract the harmonic parts of the signals. In addition, a fundamental frequency estimation algorithm based on cross-correlation is used to estimate the frequencies of the decomposed harmonics.
-
Digital waveguide models of the panpipes
The paper presents the main features of waveguide synthesis. The characteristics of the panpipes are discussed. Two proposed panpipe models, differing in computational complexity, are examined. Implementation details of these models are shown, along with the sound simulation results obtained with them. Real sounds are compared with those obtained by waveguide synthesis.
-
Expert media approach to hearing aids fitting
The paper addresses the problem of hearing aid fitting. An expert system is presented that makes it possible to find hearing aid characteristics adequate to the hearing impairment. The system is based on the rough set method and fuzzy logic.
Year 2017
-
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...
-
Examining Feature Vector for Phoneme Recognition / Analysis of parameters in the context of automatic phoneme classification
The aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Intelligent equalizer solution employing music genre and the room characteristics analysis
The paper presents an intelligent equalizer solution based on room acoustic conditions and music genre analysis. A series of acoustic characteristic measurements are performed for checking the concept proposed. White noise (reference signal) and audio excerpts belonging to six music genres are utilized as excitation signals in measurements. This results in registration of frequency responses of rooms and reverberation times. Signals...
-
Measurement and visualization of sound intensity vector distribution in proximity of acoustic diffusers
In this work, we would like to present analyses and visualizations of sound intensity distribution measured in proximity of an acoustic diffuser. Such distribution may be used for estimation of basic acoustic parameters of a diffuser. Measurement is performed with the use of a logarithmic sine sweep which allows for the analysis of waves scattered by the diffuser and rejecting the direct sound signal component. Pressure and sound...
-
A METHOD FOR ASSESSING THE EFFECTIVENESS OF SHORT-TERM HEARING AID USE WITH AN INTERNET APPLICATION
The paper presents the development of a method for assessing the effectiveness of fitting hearing-impaired people with hearing aids. The method consists of a survey based on the APHAB assessment questionnaire, supplemented with a monosyllabic word recognition test in the free field. Additional criteria are taken into account, such as the degree of hearing loss, the measured number of days and hours of hearing aid use, and the patient's experience. The method was...
Year 2007
-
Determining the noise impact on hearing using psychoacoustical noise dosimeter
This research study presents the designed noise dosimeter based on psychoacoustical properties of the human hearing system and, at the same time, evaluation of time and frequency characteristics of noise. The designed noise dosimeter enables assessing temporary threshold shift (TTS) in critical bands in real time. In this way it is possible to monitor the hearing threshold shift continuously for people who stay in the harmful noise...
Year 2006
-
Digital hearing aid with time and spectral transposition.
The launch of large-scale hearing screening in Poland has made it necessary to offer help to people suffering from hearing loss through treatment and hearing prosthetics. Meanwhile, currently offered hearing aid solutions cannot meet some specialized fitting needs, among others those of the youngest children, people working in noise, military pilots, and people using...
-
Dithering strategy applied to tinnitus masking.
The paper presents a theory explaining the phenomenon of tinnitus on the grounds of acoustics, electronics, and telecommunications. The observation that hearing is in essence an acoustic transmission system prompts a search for an interpretation of tinnitus generation within the general theory of spontaneous noise generation in transmission systems. The formulated hypothesis points to the existence of parasitic quantization, which appears in a situation...
-
Hearing aid operating in acoustical free field
Fitting very young children (from the 5th month of life) with standard hearing aids encounters many practical difficulties. This concerns the hearing aid fitting process, i.e., the selection of its settings according to the current characteristics of the children's hearing loss. Meanwhile, early fitting is of enormous importance for the development of a child's hearing, speech, and general intelligence. The paper presents the obtained...
-
Investigation of Noise Threats and Their Impact on Hearing in Selected Schools
Noise measurements conducted in selected schools in Gdansk area are presented in this paper. The main aim of this research was to determine noise threats at schools. Some objective measurements of the acoustic climate were performed employing a noise monitoring station engineered at the Multimedia System Department, Gdansk University of Technology. Simultaneously, subjective noise annoyance examinations were carried out among pupils...
-
Investigation of noise threats and their impact on hearing in selected schools - a pilot study.
Noise measurements conducted in selected schools in Gdansk area are presented in this paper. The main aim of this research was to determine noise threats at schools. Some objective measurements of the acoustic climate were performed employing a noise monitoring station engineered at the Multimedia System Department, Gdansk University of Technology. Simultaneously, subjective noise annoyance examinations were carried out among pupils...
-
IT applications for the remote testing of hearing.
Telemedicine plays an increasingly important role in the diagnosis and treatment of people with hearing loss. This is related, among other things, to the specifics of audiometric examinations. Technological progress in the field of hearing aids and cochlear implants forces new diagnostic methods in audiology as well as in otolaryngological practice. The "Telezdrowie" service, in which numerous screening tests have been implemented, is an example of conducting diagnostics in...
Year 2019
-
Discovering Rule-Based Learning Systems for the Purpose of Music Analysis
Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...
-
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model's key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...
-
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
Year 2012
-
Distributed System For Noise Threat Evaluation Based On Psychoacoustic Measurements
An innovative system designed for the continuous monitoring of acoustic climate of urban areas is presented in the paper. The assessment of environmental threats is performed using online data, acquired through a grid of engineered monitoring stations collecting comprehensive information about the acoustic climate of urban areas. The grid of proposed devices provides valuable data for the purpose of long and short time acoustic climate analysis....
-
Editor's Farewell
On this occasion, I would like to mention the major milestones Archives of Acoustics experienced during the last years. For some years, we concentrated our efforts on introducing Archives of Acoustics to the ISI Web of Knowledge and the Journal Citation Report databases. We achieved this aim, and since 2007 Archives of Acoustics has been referenced in the Journal Citation Report. Accordingly, our next objective was to obtain the Impact...
-
Employing a biofeedback method based on hemispheric synchronization in effective learning
In this paper an approach to build a brain computer-based hemispheric synchronization system is presented. The concept utilizes the wireless EEG signal registration and acquisition as well as advanced pre-processing methods. The influence of various filtration techniques of EOG artifacts on brain state recognition is examined. The emphasis is put on brain state recognition using band pass filtration for separation of individual...
-
Hand gesture recognition supported by fuzzy rules and Kalman filters
The paper presents a system based on a camera and a multimedia projector enabling a user to control computer applications by dynamic hand gestures. Gesture recognition methodology based on representing hand movement trajectory by motion vectors analysed using fuzzy rule-based inference is first given. For effective hand position tracking Kalman filters are employed. The system engineered is developed using J2SE and C++/OpenCV technology....
Year 2018
-
Editor's note and 2018 reviewers
The subject of this note is a reference to the works published in 2018, as well as to a series of articles within the special issue: Special Issue on Augmented and Participatory Sound and Music Interaction Using Semantic Audio.
-
Externalization in binaural ambisonic auralization of directional sources
The paper presents the most important components of effectively rendering a three-dimensional sound image over headphones. To this end, the degree of influence of individual factors affecting sound externalization is examined: head tracking, individual head-related transfer functions (HRTF, referring to the mathematical function of sound propagation...
-
EVALUATION OF SOUND QUALITY FEATURES ON ENVIRONMENTAL NOISE EFFECTS – A CASE STUDY APPLIED TO ROAD TRAFFIC NOISE
The paper shows a study on the relationship between noise measures and sound quality (SQ) features that are related to annoyance caused by the traffic noise. First, a methodology to perform analyses related to the traffic noise annoyance is described including references to parameters of the assessment of road noise sources. Next, the measurement setup, location and results are presented along with the derived sound quality features....
-
Examining Feature Vector for Phoneme Recognition
The aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Improving the quality of speech in the conditions of noise and interference
The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...
-
In Memoriam Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering
Biography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.
-
INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY
In recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...
-
Investigating Feature Spaces for Isolated Word Recognition
Much attention has been given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the issue related to the investigation of the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
-
Listening to Live Music: Life beyond Music Recommendation Systems
This paper first presents a short review of music recommendation systems based on social collaborative filtering. A dictionary of terms related to music recommendation systems, such as music information retrieval (MIR), Query-by-Example (QBE), Query-by-Category (QBC), music content, music annotating, music tagging, bridging the semantic gap in music domain, etc. is introduced. Bases of music recommender systems are briefly presented,...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
Year 2020
-
Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres
The purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....
-
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...
-
Improving Objective Speech Quality Indicators in Noise Conditions
PublicationThis work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...
-
Investigating Feature Spaces for Isolated Word Recognition
PublicationThe study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
Year 2009
-
Enhancement of computer character animation utilizing fuzzy rules
PublicationThe chapter presents a new method for processing computer character animation. It uses fuzzy inference based on rules and membership functions obtained by analyzing the results of subjective animation quality tests. During processing, new motion phases are automatically added to the animation, which improves visual quality and changes the fluidity and stylization of the motion in an intended way. In the paper...
-
Gesture recognition framework for multimedia content viewer controlling
PublicationIn the paper a system for controlling a multimedia content viewer by hand gestures is presented. First, selected methods used for gesture recognition are described. Two different application cases of the system, i.e. for multimedia presentation purposes and for multimedia content viewing are outlined. Moreover, a proposal of improvement of the system combining these approaches is also given. The system work cycle is reviewed. The...
-
Human-computer interaction approach applied to the multimedia system of polysensory integration
PublicationIn the paper an approach of utilizing an interaction between the human and computer in a therapy of dyslexia and other sensory disorders is presented. Bakker's neuropsychological concept of dyslexia along with therapy methods are reviewed in the context of the Multimedia System of Polysensory Integration, proposed at the Multimedia Systems Department of Gdansk Univ. of Technology. The system is presented along with the training...
Year 2005
-
Estimation of musical sound separation algorithm effectiveness employing neural networks.
PublicationBlind separation of the sounds of musical signals contained in mixed material is a difficult task. This is because sounds in harmonic relations may contain colliding sinusoidal components (harmonic partials). Evaluating the separation results is also problematic, since energy-error analysis often does not reflect the subjective quality of the separated signals. In this publication...
-
Estimation the rhythmic salience of sound with association rules and neural networks
PublicationThe paper presents experiments aimed at automatically retrieving rhythmic values in a musical phrase. Data mining methods and artificial neural networks were used for this purpose.
-
Implementacja reguł animacji w logice rozmytej
PublicationA computer system supporting the creation of animation was designed. The system uses animation rules derived from traditional animation. The rules describe how to obtain emotionally expressive character animations. For the purposes of the research, they were formulated in fuzzy logic and implemented in the Python programming language. Test animations were generated with the system and then evaluated subjectively in order to determine...
-
Intelligent multimedia applications - scanning the issue
PublicationThe aim of the special issue of this volume of the journal JIIS, entitled "Intelligent multimedia processing", was to present research in this field conducted at various centres around the world. The articles in this volume concerned intelligent processing of audio and video signals, as well as music.
-
Intelligent system for environmental noise monitoring.
PublicationThe chapter presents the design and implementation of an automatic environmental noise monitoring station. It is one of the elements of the Multimedia Noise Monitoring System being developed at the Multimedia Systems Department of the Gdansk University of Technology. The general structure of the measurement station is presented and its basic functionality is discussed. Additional capabilities of the station are described in more detail, including: communication using...
-
Internet-based automatic hearing assessment system
PublicationThe aim of the paper is to present an Internet-based hearing screening system. Hearing impairments are among the most rapidly advancing diseases in contemporary society. In this context, it becomes important to enable mass tests that detect hearing loss. The presented application includes a pure-tone audiometric test, a picture test for children, and a speech-in-noise intelligibility test. After completion...
-
Machine learning system for estimating the rhythmic salience of sounds.
PublicationThe article presents research on retrieving rhythmic data in music. The form of a ranking function for the individual sounds of a musical phrase is presented. A method was developed for creating all possible hierarchical rhythmic structures, called rhythmic hypotheses. The resulting hypotheses are then sorted in descending order of the ranking function value to determine which of the found...
Year 2008
-
Evaluation of excessive noise effects on hearing employing psychoacoustic dosimetry
PublicationResearch results regarding the impact of noise on hearing, obtained by applying the concept of Psychoacoustic Noise Dosimetry (PND), are presented. The general characteristics of the PND algorithm are discussed. Additionally, the results of hearing examinations conducted in laboratory conditions are shown. The main objective of the research was to determine the time needed for the Temporary Threshold Shift to reverse. The results were used...
-
Hearing aid fitting method based on fuzzy logic processing
PublicationAn important stage in fitting modern hearing aids is determining the hearing dynamics characteristic. This characteristic is determined from the results of a loudness scaling test. Unfortunately, these results are expressed on a scale of loudness categories, whereas hearing aids require numerical parameters. This problem can be solved by means of fuzzy logic. This paper presents a fuzzy processing method...
Year 2010
-
Evaluation of the separation algorithm performance employing ANNs
PublicationThe aim of this chapter is to present a methodology for separating musical sounds without a priori information about the sounds contained in the musical mix. The work shows that a properly trained artificial neural network (ANN) is able to automatically and correctly classify the sounds contained in a mixed signal. The classification effectiveness of the ANN is comparable to the subjective assessment of experts.
-
Exploiting audio-visual correlation by means of gaze tracking
PublicationThis paper presents a novel means for increasing audio-visual correlation analysis reliability. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, the history of and current research in the area of audio-visual perception analysis are briefly reviewed. Then the methodology employing gaze tracking is presented along with the...
-
Fuzzy rule-based dynamic gesture recognition employing camera & multimedia projector
PublicationIn the paper, a system based on a camera and a multimedia projector that enables a user to control computer applications by dynamic hand gestures is presented. The main objective is to present the gesture recognition methodology, which is based on representing the hand movement trajectory by motion vectors analyzed using fuzzy rule-based inference. The approach was engineered in the system developed with J2SE and C++ / OpenCV technology. OpenCV...
-
Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology
PublicationThis paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, which has its roots in Human-Computer Interaction, borrows relevance feedback through gaze tracking and applies it to a new area of interest, namely Quality of Experience. Results of subjective...
-
Gesture-based computer control system
PublicationIn the paper a system for controlling computer applications by hand gestures is presented. First, selected methods used for gesture recognition are described. The system hardware and a way of controlling a computer by gestures are described. The architecture of the software along with hand gesture recognition methods and algorithms used are presented. Examples of basic and complex gestures recognized by the system are given.
-
Gesture-based computer control system applied to the interactive whiteboard
PublicationIn the paper, a gesture-based computer control system coupled with a dedicated touchless interactive whiteboard is presented. The engineered system enables a user to control any top-most computer application using gestures of one or both hands. First, a review of gesture recognition applications with a focus on the methods and algorithms applied is given. Hardware and software solution of the system consisting of a PC, camera, multimedia...
-
Long-term comparative evaluation of an acoustic climate in selected schools before and after the acoustic treatment
PublicationThe results of long-term continuous noise measurements in two selected schools are presented in the paper. Noise characteristics were measured continuously there for approximately 16 months. Measurements started eight months prior to the acoustic treatment of the school corridors of both schools. An evaluation of the acoustic climates in both schools, before and after the acoustic treatment, was performed based on comparison of...
Year 2003
-
Extraction of music information based on artificial neural networks
PublicationThe article presents the assumptions of an automatic music recognition system. Based on the experiments conducted, it presents the effectiveness of the implemented algorithms depending on how the musical data are described. The implemented system is based on artificial neural networks.
Year 2004
-
Forming and Ranking Musical Rhythm Hypotheses.
PublicationThe paper presents basic concepts and definitions related to retrieving rhythmic information in musical pieces. In musicology it is assumed that sound attributes such as duration, frequency and amplitude determine the rhythmic salience of a sound. The article examines these physical properties of sound in the context of determining rhythmic salience, i.e. a measure of the tendency of a sound to be located at the beginning...
-
High accuracy and octave error immune pitch detection algorithms.
PublicationThe publication presents a method that improves the accuracy of fundamental frequency estimation for natural and synthetic sounds. The developed algorithm uses an artificial neural network. In addition, an algorithm optimized with respect to octave errors and operating in the frequency domain is presented. The presented method is very effective both for harmonic signals with significant energy of individual...
-
Intelligent methods for musical rhythm retrieval.
PublicationThe paper presents the form of a ranking function for the individual sounds of a musical phrase. A method was developed for creating all possible hierarchical rhythmic structures, called rhythmic hypotheses. The resulting hypotheses are then sorted in descending order of the ranking function value to determine which of the found hypotheses will be regarded as the correct rhythmic structure of the musical piece. The form of the function...
-
IT- enabled comparison of environmental noise levels and noise-evoked hearing impairments.
PublicationThe subject of the paper is a telemetric noise monitoring system, developed at the Multimedia Systems Department of the Gdansk University of Technology and intended for remote monitoring of environmental noise levels. In addition to a general description of the system, a number of implementation details are presented. These include, among others, a mobile measurement device, noise measurement software, a USB audio interface equipped...
-
IT- Enabled Comparison of Environmental Noise Levels and Noise-Evoked Hearing Impairments.
PublicationThe subject of the paper is a telemetric noise monitoring system, developed at the Multimedia Systems Department of the Gdansk University of Technology and intended for remote monitoring of environmental noise levels. In addition to a general description of the system, a number of implementation details are presented. These include, among others, a mobile measurement device, noise measurement software, a USB audio interface equipped with a microphone...
Year 2016
-
Guitar String Sound Retrieved from Moving Pixels
PublicationThe aim of this study was to develop a method of visual recording and analyzing the vibrations of guitar strings using high-speed cameras and dedicated video processing algorithms. The recording of a plucked string reveals the way in which the deformations propagate, composing the standing and travelling wave. The paper compares the results for a few selected models of classical and acoustic guitars, and it involves processing...
-
Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks
PublicationThis paper presents a method for improving users' quality of experience through processing of movie soundtracks. The dialogue clarity enhancement algorithms were introduced for detecting dialogue in movie soundtrack mixes and then for amplifying the dialogue components. The front channel signals (left, right, center) are analyzed in the frequency domain. The selected partials in the center channel signal, which yield high disparity...
-
Koncepcja korekcji sygnału dźwiękowego z uwzględnieniem charakterystyk częstotliwościowych pomieszczenia oraz gatunku muzycznego
PublicationThe article presents the concept of an automatic equalization system that takes into account the frequency response of the room and the music genre being played. Based on the room's frequency response, the proposed algorithm compensates for the acoustic conditions in the surroundings of the sound emitter. In addition, the compensation process takes the signal content into account by recognizing the type...
-
KORPUS MOWY ANGIELSKIEJ DO CELÓW MULTIMODALNEGO AUTOMATYCZNEGO ROZPOZNAWANIA MOWY
PublicationThe paper presents an audiovisual speech corpus containing 31 hours of English speech recordings. The corpus is intended for automatic audiovisual speech recognition. It contains video recordings from a high-speed stereo camera and audio recorded by a microphone array and a laptop microphone. Thanks to the inclusion of recordings made in noisy conditions, the corpus...
-
Loudness Scaling Test Based on Categorical Perception
PublicationThe main goal of this research study is to create a method for loudness scaling based on categorical perception. Its main features, such as the way of testing, the calibration procedure for securing reliable results, the use of natural test stimuli, etc., are described in the paper and assessed against a procedure that uses 1/2-octave bands of noise (LGOB) for the loudness growth estimation. The Mann-Whitney U-test is employed...
-
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
PublicationAutomatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...
Year 2021
-
Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network
PublicationThe goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...
-
KLASYFIKACJA EMOCJI W MUZYCE FILMOWEJ Z WYKORZYSTANIEM TESTÓW SUBIEKTYWNYCH
PublicationThe aim of the paper was to present listening tests in which respondents were asked to assign a given musical excerpt to the appropriate emotion class. The subsequent steps of the experiment comprised selecting film music for the tests (the Epidemic Sound database), preparing the survey assumptions and the emotion model used in the listening tests, as well as constructing the survey. The survey was carried out using forms...